Universal Weak Variable-Length Source Coding on Countable Infinite Alphabets

Authors

Jorge F. Silva

Pablo Piantanida

Abstract

Motivated by the fact that universal source coding on countably infinite alphabets is not feasible, this work introduces the notion of “almost lossless source coding”. Analogous to the weak variable-length source coding problem studied by Han [3], almost lossless source coding relaxes the lossless block-wise assumption to allow an average per-letter distortion that vanishes asymptotically as the block-length goes to infinity. In this setup, we show on the one hand that Shannon entropy characterizes the minimum achievable rate (as in the case of discrete sources), and on the other hand that almost lossless universal source coding becomes feasible for the family of finite-entropy stationary memoryless sources with countably infinite alphabets. Furthermore, we study a stronger notion of almost lossless universality that demands uniform convergence of the average per-letter distortion to zero, and we establish a necessary and sufficient condition for the so-called family of “envelope distributions” to achieve it. Remarkably, this is the same necessary and sufficient condition needed for the existence of a strongly minimax (lossless) universal source code for the family of envelope distributions. Finally, we show that an almost lossless coding scheme offers a faster rate of convergence for the (minimax) redundancy than the well-known information radius developed for the lossless case, at the expense of tolerating a non-zero distortion that vanishes as the block-length grows. This shows that even when lossless universality is feasible, an almost lossless scheme can offer different regimes on the rates of convergence of the (worst-case) redundancy versus the (worst-case) distortion.

The material in this paper was partially published in the 2016 [1] and 2017 [2] IEEE International Symposium on Information Theory (ISIT). J. F. Silva is with the Information and Decision Systems (IDS) Group, University of Chile, Av. Tupper 2007 Santiago, 412-3, Room 508, Chile, Tel: 56-2-9784090, Fax: 56-2-6953881 (email: [email protected]). P. Piantanida is with the Laboratoire des Signaux et Systèmes (L2S), CentraleSupélec-CNRS-Université Paris-Sud, Gif-sur-Yvette, France (email: [email protected]). Draft of August 29, 2017.
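
To make the two quantities in the abstract concrete, the following minimal sketch computes the Shannon entropy of one finite-entropy source on a countably infinite alphabet and the per-letter distortion incurred by a simple alphabet truncation. It is an illustration only and not the construction used in the paper: the geometric source, the Hamming distortion measure, the truncation rule, and all function names are assumptions introduced here.

```python
import math

def geometric_pmf(p, x):
    """P(X = x) for x = 1, 2, ... under a geometric law (assumed example source)."""
    return p * (1.0 - p) ** (x - 1)

def entropy_bits(p, terms=2000):
    """Approximate the Shannon entropy in bits by summing the first `terms` symbols."""
    total = 0.0
    for x in range(1, terms + 1):
        q = geometric_pmf(p, x)
        if q > 0.0:  # skip terms that underflow to zero
            total -= q * math.log2(q)
    return total

def truncation_distortion(p, k):
    """Expected per-letter Hamming distortion when every symbol above k is
    replaced by k, which equals the tail mass P(X > k) = (1 - p)**k."""
    return (1.0 - p) ** k

if __name__ == "__main__":
    print("entropy (bits):", entropy_bits(0.5))
    for k in (5, 10, 20):
        print("k =", k, "distortion =", truncation_distortion(0.5, k))
```

Under these assumptions the entropy is finite even though the alphabet is infinite, and the distortion of the truncated source decays geometrically in the cutoff `k`, which is the kind of vanishing per-letter distortion that the almost lossless setting tolerates.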

Similar resources

On the Optimal Coding

Novel coding schemes are introduced and relationships between optimal codes and Huffman codes are discussed. It is shown that, for finite source alphabets, a Huffman code is an optimal code, whereas an optimal code need not be a Huffman code. It is also proven that an optimal code always exists for infinite source alphabets. We show that for every random variab...

Universal variable-to-fixed length source codes

A universal variable-to-fixed length algorithm for binary memoryless sources which converges to the entropy of the source at the optimal rate is known. We study the problem of universal variable-to-fixed length coding for the class of Markov sources with finite alphabets. We give an upper bound on the performance of the code for large dictionary sizes and show that the code is optimal in the se...

Pattern Coding Meets Censoring: (almost) Adaptive Coding on Countable Alphabets

Adaptive coding faces the following problem: given a collection of source classes such that each class in the collection has non-trivial minimax redundancy rate, can we design a single code which is asymptotically minimax over each class in the collection? In particular, adaptive coding makes sense when there is no universal code on the union of classes in the collection. In this paper, we deal...

A vector quantization approach to universal noiseless coding and quantization

A two-stage code is a block code in which each block of data is coded in two stages: the first stage codes the identity of a block code among a collection of codes, and the second stage codes the data using the identified code. The collection of codes may be noiseless codes, fixed-rate quantizers, or variable-rate quantizers. We take a vector quantization approach to two-stage coding, in which ...
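
The two-stage structure described in this summary can be sketched as follows. This is a minimal illustration under stated assumptions, not the vector quantization design of the cited paper: the collection holds just two hypothetical noiseless codes (a unary code and an 8-bit fixed-length code), blocks are lists of integers in the range 0 to 255, and all names are introduced here for the example.

```python
import math

def unary_encode(block):
    """Code each non-negative integer n as n ones followed by a zero."""
    return "".join("1" * n + "0" for n in block)

def unary_decode(bits):
    out, run = [], 0
    for b in bits:
        if b == "1":
            run += 1
        else:
            out.append(run)
            run = 0
    return out

def fixed_encode(block, width=8):
    """Code each integer in 0..255 with exactly `width` bits."""
    return "".join(format(n, f"0{width}b") for n in block)

def fixed_decode(bits, width=8):
    return [int(bits[i:i + width], 2) for i in range(0, len(bits), width)]

# Hypothetical collection of block codes shared by encoder and decoder.
CODES = [(unary_encode, unary_decode), (fixed_encode, fixed_decode)]
INDEX_BITS = max(1, math.ceil(math.log2(len(CODES))))

def two_stage_encode(block):
    """Stage 1: index of the shortest code for this block. Stage 2: the coded data."""
    candidates = [enc(block) for enc, _ in CODES]
    best = min(range(len(CODES)), key=lambda i: len(candidates[i]))
    return format(best, f"0{INDEX_BITS}b") + candidates[best]

def two_stage_decode(bits):
    """Read the code index, then decode the rest with the identified code."""
    index = int(bits[:INDEX_BITS], 2)
    _, dec = CODES[index]
    return dec(bits[INDEX_BITS:])

if __name__ == "__main__":
    for block in ([0, 1, 2, 0], [200, 100]):
        coded = two_stage_encode(block)
        print(block, "->", coded, "->", two_stage_decode(coded))
```

Small values favor the unary code and large values favor the fixed-length code, so the first-stage index lets one encoder adapt per block at a cost of `INDEX_BITS` extra bits.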

[hal-00665033, v1] About adaptive coding on countable alphabets

This paper sheds light on universal coding with respect to classes of memoryless sources over a countable alphabet defined by an envelope function with finite and non-decreasing hazard rate. We prove that the auto-censuring (AC) code introduced by Bontemps (2011) is adaptive with respect to the collection of such classes. The analysis builds on the tight characterization of universal redundancy...

Journal title:

CoRR

Volume abs/1708.08103

Publication date: 2017
